HimL (Health in my Language)

Author

  • Barry Haddow
Abstract

HimL (www.himl.eu) is a three-year EU H2020 innovation action, which started in February 2015. Its aim is to increase the availability of public health information via automatic translation. Targeting languages of Central and Eastern Europe (Czech, German, Polish and Romanian), we aim to produce translations which are adapted to the health domain, semantically accurate and morphologically correct. The project is coordinated by Barry Haddow (University of Edinburgh) and includes two additional academic partners (Charles University and LMU Munich), one integration partner (Lingea) and two user partners (NHS 24 and Cochrane).

Description

In HimL we aim to deploy and evaluate machine translation systems for the public health domain, addressing domain adaptation, semantic accuracy and target morphology. The systems are used to translate content for NHS 24 (Scotland's national telehealth organisation) and Cochrane (an international NGO that produces systematic reviews of healthcare topics). The project has now been running for over a year, and we have already developed the first release of our translation systems and used them to translate the user partner websites.

To build these systems, we have collected a large and diverse training set. We are analysing how well existing domain adaptation techniques combine these resources, and investigating the use of neural models in domain adaptation. We have been developing our corrective approaches to morphology handling, using machine learning to provide language independence, and extending the two-step approach to morphology to handle a wider range of phenomena and new language pairs.

For improved semantic accuracy, we have experimented with using semantic roles to make sure important information is not lost, developed methods to remove semantically incorrect translations from the model, and analysed the problems that arise in the translation of negation. The goal of semantic accuracy is supported by our development of new human and automatic semantic evaluation measures based on the UCCA (Universal Conceptual Cognitive Annotation) framework.

Similar resources

TmTriangulate: A Tool for Phrase Table Triangulation

This work was supported by grants no. 645452 (QT21) and no. 644402 (HimL) of the EU and SVV 260 104 of the Czech Republic. We used language resources hosted by the LINDAT/CLARIN project LM2010013 of the Ministry of Education, Youth and Sports. Introduction. Under-resourced language pair: scarcity of parallel corpora. SMT problem: no direct data → no SMT training; insufficient data → poor SMT per...

This Is My (Post) Truth, Tell Me Yours; Comment on “The Rise of Post-truth Populism in Pluralist Liberal Democracies: Challenges for Health Policy”

This is a commentary on the article ‘The rise of post-truth populism in pluralist liberal democracies: challenges for health policy.’ It critically examines two of its key concepts: populism and ‘post truth.’ This commentary argues that there are different types of populism, with unclear links to impacts, and that in some ways, ‘post-truth’ has resonances with arguments advanced in the period a...

Caliban's Meaning: The Culture of Language

Drawing largely on Aidoo’s (1970) play, Anowa, as well as lived experiences, I argue on the philosophical flaws of Ashcroft’s (2009) claim that there is no inherent link between language and culture. This essay subsequently explores the implication of my argument on some transformational domains of English in particular though it has obvious applicability to the role of colonial languages in ge...

The QT21/HimL Combined Machine Translation System

This paper describes the joint submission of the QT21 and HimL projects for the English→Romanian translation task of the ACL 2016 First Conference on Machine Translation (WMT 2016). The submission is a system combination which combines twelve different statistical machine translation systems provided by the different groups (RWTH Aachen University, LMU Munich, Charles University in Prague, Univ...

Editorial Volume 5, Issue 2

Our Journal's tendency towards the real world in applied linguistics and literary studies should have significant epistemological and methodological consequences in researching the fields. The interest in the real world makes the problems we may have in our everyday lives our 'points of departure' in research. According to my experience of research in our universities throughout their history, ...

Publication date: 2015